A consensus sequence of the Feline coronavirus (FCoV) (strain FIPV WSU-79/1146) genome 
was determined from overlapping cDNA fragments produced by RT-PCR amplification of viral RNA. 
The genome was found to be 29 125 nt in length, excluding the poly(A) tail. Analysis of the 
sequence identified conserved open reading frames and revealed an overall genome organization 
similar to that of other coronaviruses. The genomic RNA was analysed for putative cis-acting 
elements and the pattern of subgenomic mRNA synthesis was analysed by Northern blotting. 
Comparative sequence analysis of the predicted FCoV proteins identified 16 replicase proteins 
(nsp1-nsp16) and four structural proteins (spike, membrane, envelope and nucleocapsid). Two 
mRNAs encoding putative accessory proteins were also detected. Phylogenetic analyses 
confirmed that FIPV WSU-79/1146 belongs to the coronavirus subgroup G1-1. These results 
confirm and extend previous findings from partial sequence analysis of FCoV genomes. 


Coronaviruses are enveloped, positive-strand RNA viruses 
that are associated mainly with enteric or respiratory 
diseases in humans, companion animals and livestock. 
Coronavirus particles contain a genomic RNA of approximately 
27 000-30 000 nt and four structural proteins: the 
spike glycoprotein S, the membrane protein M, the envelope 
protein E and the nucleocapsid protein N (Siddell et al., 
2005). In the infected cell, coronavirus gene expression starts 
with the translation of the replicase gene. The replicase gene 
comprises two large open reading frames (ORFs), designated 
ORF1a and ORF1b. The upstream ORF1a encodes a 
polyprotein of approximately 450-500 kDa, termed 
polyprotein pp1a, whereas ORF1a and ORF1b together encode 
a polyprotein of 750-800 kDa, termed polyprotein pp1ab 
(Siddell et al., 2005). The pp1ab polyprotein is synthesized 
by a (−1) ribosomal frameshift during translation of the 
genomic RNA (Brierley, 1995; Thiel et al., 2003). The 
polyproteins pp1a and pp1ab are then processed by 
virus-encoded proteinases to generate 15-16 end products 
[replicase or non-structural proteins (nsp)] and an unknown 
number of intermediate products (Ziebuhr et al., 2000). The 
replicase proteins assemble to form the membrane-bound 
replication-transcription complex in the cytoplasm of the 
infected cell (Gosert et al., 2002; Prentice et al., 2004). 


The coronavirus replication-transcription complex mediates 
replication of the genomic RNA and transcription of 
multiple subgenomic mRNAs. Coronavirus transcription is 
a complex process involving the discontinuous synthesis of 
up to eight (−)-strand RNAs of subgenomic size, which 
contain sequences corresponding to the 5' and 3' ends of 
the genome and serve as templates for the synthesis of 
subgenomic mRNAs (Sawicki & Sawicki, 1990; Spaan et al., 
1983). Important elements in the transcription process 
are the 'transcription-regulatory sequence elements' (TRS 
elements), which determine the fusion sites of leader and 
body-derived sequences of the subgenomic RNAs. The 
number of TRS elements correlates with the number of 
subgenomic mRNAs produced by a particular coronavirus. 
The subgenomic mRNAs express both structural and 
accessory proteins. 


The GenBank/EMBL/DDBJ accession number for the genomic 
sequence of FCoV strain FIPV WSU-79/1146 determined in this 
study is DQ010921. 


Supplementary figures and tables are available in JGV Online. 


Feline coronavirus (FCoV) infection is extremely common 
in cats, and especially in kittens. For example, in the UK, 
approximately 40 % of the domestic cat population has 
been infected. In multi-cat households, this figure increases 
to about 90 % (Addie, 2000; Addie & Jarrett, 1992; Sparkes 
et al., 1992). Natural infections with FCoV are usually 
transient, although a significant percentage of infections 
may become persistent (Addie & Jarrett, 2001). Infections 
may be asymptomatic or may result in mild, self-limiting 
gastrointestinal disease. In these cases, the causative agent 
is known as feline enteric coronavirus (FECV). In a small 
percentage of animals, a fatal, multisystemic, 
immune-mediated disease occurs and this is known as feline 
infectious peritonitis (FIP) (Pedersen, 1995). The virus 
associated with FIP is referred to as feline infectious peritonitis 
virus (FIPV). It is proposed that cats acquire FIPV by 
mutation of an endogenous FECV (Poland et al., 1996; 
Vennema et al., 1998) or, rarely, through excreted virus 
from other FIPV-infected animals (Watt et al., 1993). Any 
genetic difference(s) between FECV and FIPV that can 
account for their different pathogenicity remain to be 
identified. 


FIPV WSU-79/1146 (P100) was obtained from the ATCC 
(VR-2202). The virus was plaque-purified and propagated 
in Crandell-Reese feline kidney (CrFK) cells, and viral 
poly(A)-containing RNA was isolated from infected cells by 
using TRIzol reagent and Dynabeads oligo(dT)25 (Thiel 
et al., 1997). Published sequence data for FIPV WSU-79/1146 
(GenBank accession no. AY204704) and other FCoV 
strains (Herrewegh et al., 1998) were used to design primers 
to amplify and sequence overlapping PCR products spanning 
the whole genome length (see Supplementary Table S1, 
available in JGV Online). 


The genomic sequence of FCoV strain FIPV WSU-79/1146 
comprises 29 125 nt, excluding the 3' poly(A) tail. The 
sequence has been deposited in GenBank (accession no. 
DQ010921). The 5' untranslated region (UTR) comprises 
311 nt and includes an ORF of four codons (nt 117-128) 
that lies within a putative stem-loop structure [nt 102-140; 
see Supplementary Fig. S1(a), available in JGV Online] that 
is similar to the stem-loop III structure that has been 
identified as a cis-acting element in bovine coronavirus (BCoV) 
defective interfering RNA replication (Raman et al., 2003). 
Also within this region, it is possible to identify another 
putative secondary structure, the so-called 'leader-TRS 
hairpin' or LTH [nt 65-128; Supplementary Fig. S1(b)] (Van 
Den Born et al., 2004). The LTH structure encompasses 
the sequence 5'-CUAAAC-3' (nt 93-98), which represents 
the core of the FIPV TRS element (de Groot et al., 1988) 
and defines the fusion sites of leader and 'body'-derived 
sequences in coronavirus subgenomic mRNAs. The 3' UTR 
of FIPV WSU-79/1146 would also be expected to contain 
cis-acting sequences and structural elements involved in 
viral RNA replication. In our analysis, we were able to 
identify two putative structures, spanning nt 28842-28964 
(see Supplementary Fig. S2, available in JGV Online), that 
bear striking resemblance to the bulged stem-loop and 
pseudoknot structures identified by Masters and colleagues 
for MHV-A59 (Goebel et al., 2004). 


Analysis of the FIPV WSU-79/1146 genomic sequence with 
the NCBI graphical analysis tool ORF Finder identifies 
six ORFs that can be deduced to encode the non-structural 
and structural proteins of the virus (see Supplementary 
Table S2, available in JGV Online). ORF1a (nt 312-12208) 
and ORF1b (nt 12164-20209) encode the non-structural 
proteins. These ORFs overlap by 46 nt and a typical 
coronavirus 'slip site', 5'-UUUAAAC-3' (nt 12173-12179), 
is located within this overlap. Adjacent and downstream 
of the 'slip site' is a putative 'pseudoknot' structure (see 
Supplementary Fig. S3, available in JGV Online). The 'slip 
site' and 'pseudoknot' are elements required for 
programmed (−1) ribosomal frameshifting during translation 
of the coronavirus genomic RNA (Brierley, 1995). In the 
case of FIPV WSU-79/1146, this results in the expression of 
two primary translation products, pp1a and pp1ab, that 
are predicted to have molecular masses of 441.3 and 
742.7 kDa, respectively. The ORFs encoding structural 
proteins are ORF S (nt 20206-24564), ORF E (nt 
25722-25970), ORF M (nt 25981-26769) and ORF N 
(nt 26782-27915). The predicted translation products 
are the spike glycoprotein (160 kDa), the envelope protein 
(9.4 kDa), the membrane protein (29.8 kDa) and the 
nucleocapsid protein (42.7 kDa), respectively. Phylogenetic 
analysis shows that the FIPV WSU-79/1146 non-structural 
and structural proteins are related closely to their 
Transmissible gastroenteritis virus (TGEV) homologues, less 
closely to their Human coronavirus 229E (HCoV-229E) 
homologues and most distantly to their Murine hepatitis 
virus (MHV) and Infectious bronchitis virus (IBV) 
homologues. These data are consistent with the accepted 
phylogeny of coronaviruses that places FCoV in subgroup G1-1 of 
coronavirus group 1 (Gonzalez et al., 2003). 
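
The relationship between the two replicase ORFs can be checked directly from the coordinates quoted above. The following is a minimal sketch, not part of the original analysis; the function names `frame_offset` and `contains` are hypothetical helpers introduced here for illustration. It confirms that ORF1b lies in the −1 frame relative to ORF1a and that the reported slip site falls inside the region shared by the two ORFs.

```python
# Coordinates (1-based, inclusive) as quoted in the text; helper names are
# illustrative assumptions, not taken from the original study.
ORF1A = (312, 12208)
ORF1B = (12164, 20209)
SLIP_SITE = (12173, 12179)


def frame_offset(upstream_start: int, downstream_start: int) -> int:
    """Return the frame of the downstream ORF relative to the upstream ORF.

    0 means the same frame, +1 a shift of one nucleotide in the 3' direction
    and -1 a shift of one nucleotide in the 5' direction.
    """
    offset = (downstream_start - upstream_start) % 3
    return offset - 3 if offset == 2 else offset


def contains(outer: tuple, inner: tuple) -> bool:
    """Return True if the inner interval lies entirely within the outer one."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


# Region covered by both ORF1a and ORF1b.
overlap = (ORF1B[0], ORF1A[1])

print(frame_offset(ORF1A[0], ORF1B[0]))  # -1
print(contains(overlap, SLIP_SITE))      # True
```

A result of −1 is what would be expected if pp1ab is produced by the ribosome slipping back one nucleotide at the slip site and continuing in the ORF1b frame.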


Translation of the coronavirus polyproteins pp1a and 
pp1ab is coupled with extensive proteolytic processing by 
virus-encoded papain-like cysteine proteinases (PL1pro and 
PL2pro) and a 3C-like cysteine protease (3CLpro or main 
proteinase) (Ziebuhr et al., 2000). The conservation of 
both the positions and sequences of PL1pro/PL2pro and 
3CLpro cleavage sites allows their location in the FIPV 
WSU-79/1146 polyproteins to be predicted (Table 1). These 
predictions support the reported substrate specificity of 
the FIPV 3CLpro (Hegyi & Ziebuhr, 2002) and indicate that, 
as for other coronaviruses, there are 11 3CLpro cleavages in 
total in pp1a/pp1ab. With three PLpro cleavage sites in 
pp1a/pp1ab, this means that the replicase polyproteins can be 
processed into a total of 16 non-structural polypeptides 
(nsp1-nsp16). 
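
The count of 16 products follows from the predicted cleavage positions. As a hedged illustration, the sketch below splits pp1a and pp1ab at the C-terminal residues of the cleavage sites listed in Table 1 and counts the distinct products. The function `cleave` and the variable names are hypothetical and introduced here only for illustration; the polyprotein lengths are taken as the last residues of nsp11 and nsp16 in Table 1.

```python
# C-terminal residue of each predicted cleavage site (from Table 1).
PL_PRO_SITES = [110, 879, 2336]
THREE_CL_PRO_SITES = [2826, 3128, 3422, 3505, 3700, 3811, 3946,
                      4876, 5475, 5994, 6333]
PP1A_LENGTH = 3965   # last residue of nsp11 in Table 1
PP1AB_LENGTH = 6633  # last residue of nsp16 in Table 1


def cleave(length, sites):
    """Split residues 1..length after each site; return (first, last) pairs."""
    products = []
    start = 1
    for site in sorted(s for s in sites if s < length):
        products.append((start, site))
        start = site + 1
    products.append((start, length))
    return products


all_sites = PL_PRO_SITES + THREE_CL_PRO_SITES
pp1a_products = cleave(PP1A_LENGTH, all_sites)
pp1ab_products = cleave(PP1AB_LENGTH, all_sites)

# Products shared by both polyproteins are counted once.
distinct = sorted(set(pp1a_products) | set(pp1ab_products))
print(len(pp1a_products), len(pp1ab_products), len(distinct))  # 11 15 16
```

The 14 sites give 15 products from pp1ab. The carboxy-terminal product of pp1a (residues 3947-3965, nsp11) is the only product not also released from pp1ab, which brings the total to 16.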


Comparison of the sequences of the non-structural 
proteins of FIPV WSU-79/1146 with those of TGEV, 
HCoV-229E, MHV and IBV (Table 1) is, again, consistent with the 
accepted phylogeny and, as has been observed for many 
coronaviruses, shows that the polypeptides encoded by 
ORF1b are more highly conserved than those encoded by 
ORF1a. Broadly, this has been interpreted to reflect the linear 
organization of non-structural proteins that have essential 
replicative functions downstream of the 3CLpro domain, and 
a number of non-structural proteins that have diverged, 
perhaps due to host-specific adaptations, located upstream 
of the 3CLpro domain. It is noticeable that the FIPV 
WSU-79/1146 nsp1 and nsp5 polypeptides appear to have diverged 
less from their ancestral homologues than the other 
amino-proximal polypeptides of pp1a/pp1ab. 


The functions associated with the FIPV WSU-79/1146 
non-structural proteins can be predicted by comparison with 
other coronaviruses. On the basis of bioinformatic analysis, 
Gorbalenya and colleagues (Snijder et al., 2003, 2005) have 
proposed enzymic activities for seven of the coronavirus 
non-structural proteins (nsp3, nsp5, nsp12, nsp13, nsp14, 
nsp15 and nsp16) and four of these (nsp3, nsp5, nsp13 and 
nsp15) have been confirmed by experiment (Ziebuhr, 2005). 
These enzymic activities are listed as putative functions 
of the FIPV WSU-79/1146 non-structural proteins in 
Table 1. 


Table 1. Predicted FIPV WSU-79/1146 replicase-cleavage products 

Comparisons of predicted FIPV non-structural proteins with those of TGEV-Purdue (GenBank accession no. NC_002306), HCoV-229E 
(NC_002645), MHV-A59 (NC_001846) and IBV-Beaudette (NC_001451) were done by using MEGALIGN (DNASTAR; Jotun Hein method) and are 
expressed as percentage amino acid identity. Protease-cleavage sites within the replicase polyprotein were predicted by alignment with the 
replicase polyproteins of TGEV, MHV and IBV. Abbreviations: nsp, non-structural protein; TI, translation initiation; TT, translation 
termination; RFS, ribosomal frameshift; PLpro, papain-like proteinase; 3CLpro, 3C-like proteinase; ADRP, ADP-ribose 1''-phosphatase; ssRNA, 
single-strand RNA; RdRp, RNA-dependent RNA polymerase. 

Cleavage  Polyprotein  Position in polyprotein   Size  Expression      Amino acid identity (%)      Putative
product                (amino acid residues)     (aa)                  TGEV  HCoV-229E  MHV   IBV   function(s)
nsp1      pp1a/pp1ab   1Met-Gly110               110   TI+PLpro        93.6  28.2       13.6  -
nsp2      pp1a/pp1ab   111Ala-Gly879             769   PLpro           78.8  35.2       15.3  13.8
nsp3      pp1a/pp1ab   880Gly-Gly2336            1457  PLpro           79.0  34.2       18.7  14.5  PLpro(s), ADRP
nsp4      pp1a/pp1ab   2337Ser-Gln2826           490   PLpro+3CLpro    87.7  50.8       30.8  24.7
nsp5      pp1a/pp1ab   2827Ser-Gln3128           302   3CLpro          93.1  60.7       46.8  43.3  3CLpro
nsp6      pp1a/pp1ab   3129Ser-Gln3422           294   3CLpro          78.2  40.6       26.8  23.6
nsp7      pp1a/pp1ab   3423Ser-Gln3505           83    3CLpro          96.4  67.9       40.5  40.5
nsp8      pp1a/pp1ab   3506Ser-Gln3700           195   3CLpro          92.9  61.7       42.9  42.2
nsp9      pp1a/pp1ab   3701Asn-Gln3811           111   3CLpro          89.3  59.1       38.2  33.6  ssRNA binding
nsp10     pp1a/pp1ab   3812Ala-Gln3946           135   3CLpro          92.6  70.6       58.1  53.7
nsp11     pp1a         3947Gly-Asp3965           19    3CLpro+TT
nsp12     pp1ab        3947Gly-Gln4876           929   RFS+3CLpro      97.8  74.6       59.2  59.6  RdRp
nsp13     pp1ab        4877Ala-Gln5475           599   RFS+3CLpro      99.0  74.9       59.7  58.4  Helicase
nsp14     pp1ab        5476Ala-Gln5994           519   RFS+3CLpro      99.0  71.6       51.8  51.5  Exonuclease
nsp15     pp1ab        5995Ser-Gln6333           339   RFS+3CLpro      98.2  65.6       42.7  36.8  Endoribonuclease
nsp16     pp1ab        6334Ser-Pro6633           300   RFS+3CLpro+TT   97.7  70.1       55.4  50.8  2'-O-methyltransferase


Detailed analysis of the predicted amino acid sequences of 
the FIPV WSU-79/1146 structural proteins reveals that they 
show the features characteristic of other coronavirus spike, 
envelope, membrane and nucleocapsid proteins. These 
features are listed in Supplementary Table S3, available in JGV 
Online. Additionally, all coronaviruses encode a number of 
proteins that are thought to be dispensable for replication in 
cell culture, but apparently provide a selective advantage in 
vivo. The genes encoding these so-called 'accessory' proteins 
are usually located in distinct clusters, downstream of the 
replicase-protein genes. In the case of FIPV WSU-79/1146, 
two regions of the genome that encode putative accessory 
proteins have been identified in previous studies (Haijema 
et al., 2003, 2004). One is located between the S and E 
protein genes and one is located downstream of the N 
protein gene; they are known as ORFs 3abc and ORFs 7ab, 
respectively. Analysis of the sequence of the ORF 7ab region 
of the FIPV WSU-79/1146 genome reported here allows us 
to identify an ORF that corresponds to the previously 
recognized ORF 7a. It is not possible to identify an ORF that 
would correspond to ORF 7b. However, if a single 
nucleotide change were permitted (namely, if the U at 
nt 28374 were replaced with C), it would be possible to restore a 
single, large ORF that would correspond to the previously 
recognized ORF 7b. In the case of the ORF 3abc region, it is 
possible to identify ORFs that correspond to the previously 
recognized ORFs 3a and 3b (Haijema et al., 2003). ORF 3c is 
not apparent. However, we note that, with two additional 
nucleotide insertions, it would be possible to extend ORF 3b 
to a position that overlaps with the downstream ORF E. 
Further experiments will be needed to identify the 
translation products of both the ORF 7ab and ORF 3abc regions 
of the FIPV WSU-79/1146 genome, as well as those of isolates 
that have not been propagated in cell culture for extended 
periods of time. 


[Fig. 1 image: Northern blot, lanes 1-3, with the positions and sizes (kb) of the viral RNAs marked.] 

Fig. 1. Northern blot analysis of poly(A)-containing RNA from 
FIPV WSU-79/1146-infected CrFK cells. A 32P-labelled probe 
was used to detect the genomic and subgenomic mRNAs of 
FIPV WSU-79/1146 (lane 2) or MHV A59 (lane 1). FCoV 
RNAs were detected by hybridization with a 986 bp, α-32P 
random prime-labelled (Megaprime; Amersham Biosciences) 
PCR product corresponding to sequences in ORF 7 and the 
3' UTR of the FCoV genome. Poly(A)-containing RNA from 
MHV-infected cells was a kind gift from Dr H. Stokes and was 
detected by hybridization with a 466 bp, α-32P random 
prime-labelled PCR product corresponding to sequences in the N 
protein ORF of the MHV genome. The sizes of the RNAs are 
indicated in kb. Lane 3 contains the same material as lane 2, 
electrophoresed on a 1.3 % agarose gel to increase resolution 
in the region of mRNAs 4 and 5. 


As described above, it is the TRS elements that determine 
the fusion sites of leader and 'body'-derived sequences of 
coronavirus mRNAs, and the number of TRS elements 
correlates with the number of subgenomic mRNAs 
produced by a particular virus. The TRS sequence for FIPV has 
been identified as containing the motif 5'-CUAAAC-3' (de 
Groot et al., 1988) and our sequence analysis shows that 
this motif occurs 11 times in the FIPV WSU-79/1146 
genome. de Groot et al. (1987) have shown previously that at 
least five subgenomic mRNAs are produced in FIPV 
WSU-79/1146-infected cells and our analysis suggests that six 
are produced (Fig. 1). The core motif therefore occurs more 
often than can be accounted for by the leader TRS and six 
body TRS elements. As has been pointed out by others 
(Zuniga et al., 2004), it is clear from these data that, although 
the TRS core motif is essential for the discontinuous 
extension of coronavirus (−)-strand templates, it is not 
sufficient. 
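
Counting occurrences of the core motif is a simple string search. The sketch below is illustrative only: the function name `find_motif` is hypothetical and the RNA fragment is made up for the example, not taken from the FIPV WSU-79/1146 sequence. Applied to the deposited genome sequence (written as RNA), the same approach would list the positions of every 5'-CUAAAC-3' copy, which could then be compared with the mRNAs detected by Northern blotting.

```python
def find_motif(sequence: str, motif: str = "CUAAAC") -> list:
    """Return the 1-based start position of every occurrence of motif.

    Overlapping occurrences are reported, because the search resumes one
    nucleotide after each hit.
    """
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start + 1)
        start = sequence.find(motif, start + 1)
    return positions


# Made-up RNA fragment for illustration only; not FIPV sequence.
toy_rna = "GGACUAAACGAUUCUAAACUUAGCCUAAAUCUAAACG"
print(find_motif(toy_rna))  # [4, 14, 31]
```

Note that the near-match CUAAAU at position 25 of the toy fragment is not reported, since only exact copies of the core motif are counted.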


The final analysis that we undertook in this study was to 
determine the sequence of the TRS-body 'junctions' for 
each of the intracellular mRNAs of FIPV WSU-79/1146. 
To do this, the 5'-proximal regions of mRNAs 2-7 were 
amplified by RT-PCR and sequenced. The conclusions are 
summarized in Fig. 2. The analysis confirms the minimal 
TRS core sequence of FIPV WSU-79/1146 as 
5'-CUAAAC-3'. The fact that the abundance of the subgenomic mRNAs 
does not correlate with the potential for base pairing 
between the leader and complement of the body TRS 
elements indicates, again, that additional factors (e.g. 
proteins) must be involved in the process of discontinuous 
extension that takes place during synthesis of the 
subgenomic (−)-strand templates. 
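
The leader-body junctions described above can be pictured with a deliberately simplified model in which a subgenomic mRNA consists of the genome 5' end up to and including the leader TRS core, joined to the sequence immediately downstream of a body TRS core. The sketch below rests on that assumption; it ignores the (−)-strand intermediate and any flanking-sequence effects. The function `subgenomic_mrna` and the toy genome are hypothetical and are not FIPV sequence.

```python
CORE = "CUAAAC"


def subgenomic_mrna(genome: str, body_trs_start: int, core: str = CORE) -> str:
    """Join the leader (5' end through the leader TRS core) to the body.

    body_trs_start is the 1-based start of a body copy of the core motif.
    The first copy of the core motif in the genome is taken as the leader TRS.
    """
    leader_index = genome.find(core)
    if leader_index == -1:
        raise ValueError("no leader TRS core found")
    body_index = body_trs_start - 1
    if body_index <= leader_index:
        raise ValueError("body TRS must lie downstream of the leader TRS")
    if genome[body_index:body_index + len(core)] != core:
        raise ValueError("no core motif at the given body position")
    leader = genome[:leader_index + len(core)]
    body = genome[body_index + len(core):]
    return leader + body


# Made-up genome: leader TRS at nt 4, body TRS at nt 18, then a short ORF.
toy_genome = "GGACUAAACGAUUCCGGCUAAACAUGGCAUAA"
print(subgenomic_mrna(toy_genome, 18))  # GGACUAAACAUGGCAUAA
```

In this model a single copy of the core motif is retained at the junction, with the initiation codon of the downstream ORF following it.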


[Fig. 2 image: aligned leader-body junction sequences of mRNAs 1-7.] 

Fig. 2. Alignment of FIPV WSU-79/1146 TRS elements. 
Nucleotides matching the leader TRS are underlined. These 
sequences represent the leader-body junctions of the 
subgenomic mRNAs. The minimal consensus sequence, or 'core 
motif' of the TRS, that is present in all functional TRS elements 
is shaded. The initiation codons for the mRNA translation 
product(s) are shown in bold. 


This study provides the first comprehensive analysis of the 
FIPV WSU-79/1146 genome sequence, including a complete 
consensus sequence for the non-structural protein-coding 
region of the genome. Thus, it is now possible to predict 
the primary sequence of the full complement of FCoV 
non-structural proteins and oligonucleotide primers can be 
designed for the cloning and expression of each of these 
genes. The availability of recombinant forms of the FIPV 
WSU-79/1146 non-structural proteins will help to provide 
a more detailed understanding of their structure and 
function. Secondly, this study predicts a number of cis-acting 
RNA elements in the genome that may be involved in FCoV 
replication and transcription. Directed mutagenesis and 
structural methods can now be used to investigate the 
structure-function relationships of these elements. Thirdly, 
the genomic sequence of FIPV WSU-79/1146 can now be 
compared with the sequence of RNA from clinical isolates 
or RNA amplified from clinical material. This sort of 
information will have, for example, important implications 
for development of prophylactic or therapeutic strategies 
to control or prevent FCoV infections. 